Allow passing in DistillationConfig directly to setup fn #399
Conversation
Signed-off-by: Asha Anoosheh <[email protected]>
Walkthrough

Renames and reworks the distillation config loader to accept a path, a DistillationConfig instance, or None. Implements branching: None yields a default config, a provided DistillationConfig is used directly, or a YAML path is loaded and parsed. Updates docstrings and replaces the old function.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant C as Caller
    participant M as setup_distillation_config
    participant Y as YAML Loader

    C->>M: setup_distillation_config(config_or_path, student_cfg, teacher_cfg)
    alt config_or_path is None
        M-->>C: Default DistillationConfig
    else config_or_path is DistillationConfig
        M-->>C: Provided DistillationConfig (as-is)
    else config_or_path is str (path)
        M->>Y: Load & parse YAML
        Y-->>M: Parsed settings
        M-->>C: DistillationConfig from YAML
    end
```
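For callers, the three accepted inputs then look like this (a hedged sketch; `student_cfg` and `teacher_cfg` stand in for the Megatron model config objects, and the YAML filename is made up):

```python
# Illustrative only -- each call returns a ready-to-use DistillationConfig.
cfg = setup_distillation_config(None, student_cfg, teacher_cfg)            # default logits-only config
cfg = setup_distillation_config("distill.yaml", student_cfg, teacher_cfg)  # load settings from YAML
cfg = setup_distillation_config(DistillationConfig(), student_cfg, teacher_cfg)  # use instance as-is
```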
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (2)
modelopt/torch/distill/plugins/megatron.py (2)
Lines 100-109: Clarify the docstring for the new input modes.

The docstring update mentions "the incomplete config itself", which is unclear. Consider revising to explicitly document all three input modes. Apply this diff to improve the docstring:
- """Read the distillation yaml config file specified by ``args.export_kd_cfg``. + """Setup distillation configuration from various input sources. Args: - config_or_path: Path to user-defined distillation settings yaml file, or the incomplete config itself. - If `None`, uses default logits-only distillation mode for GPT models. + config_or_path: One of: + - `None`: Uses default logits-only distillation mode for GPT models. + - `DistillationConfig`: Uses the provided config instance directly. + - `str`: Path to a YAML file containing distillation settings. student_cfg: Model config for student model. teacher_cfg: Model config for teacher model.
Lines 115-118: Add error handling for file operations and YAML parsing.

The code opens a file and parses YAML without error handling. If the path doesn't exist, the file is malformed, or the YAML contains invalid fields for DistillationConfig, the function will raise an unhandled exception. Apply this diff to add basic error handling:
```diff
     else:
-        with open(config_or_path) as f:
-            cfg = yaml.safe_load(f)
-        cfg = DistillationConfig(**cfg)
+        try:
+            with open(config_or_path) as f:
+                cfg_dict = yaml.safe_load(f)
+            cfg = DistillationConfig(**cfg_dict)
+        except FileNotFoundError:
+            raise FileNotFoundError(f"Distillation config file not found: {config_or_path}")
+        except yaml.YAMLError as e:
+            raise ValueError(f"Invalid YAML in distillation config: {e}")
+        except TypeError as e:
+            raise ValueError(f"Invalid distillation config fields: {e}")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
modelopt/torch/distill/plugins/megatron.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: linux
- GitHub Check: wait-checks / wait
- GitHub Check: wait-checks / wait
- GitHub Check: build-docs
- GitHub Check: code-quality
🔇 Additional comments (1)
modelopt/torch/distill/plugins/megatron.py (1)
Lines 95-99: Rename verified: no remaining references to load_distillation_config. All internal callers are updated.
```python
if config_or_path is None:
    logger.warning("Distillation config not provided. Using default.")
    cfg = DistillationConfig()
elif isinstance(config_or_path, DistillationConfig):
    cfg = config_or_path
else:
    with open(config_or_path) as f:
        cfg = yaml.safe_load(f)
    cfg = DistillationConfig(**cfg)
```
Consider the implications of mutating the input DistillationConfig.
When a DistillationConfig instance is passed in (line 114), the function directly assigns it to cfg and later mutates it by setting cfg.criterion (line 146) and cfg.loss_balancer (line 147). This means the caller's original config object will be modified, which may be unexpected if they intend to reuse the config.
Consider one of these approaches:
- Document the mutation in the docstring:
```diff
         config_or_path: One of:
             - `None`: Uses default logits-only distillation mode for GPT models.
-            - `DistillationConfig`: Uses the provided config instance directly.
+            - `DistillationConfig`: Uses the provided config instance directly (will be modified in-place).
             - `str`: Path to a YAML file containing distillation settings.
```
- Create a copy to avoid mutating the input:
```diff
 elif isinstance(config_or_path, DistillationConfig):
-    cfg = config_or_path
+    from copy import deepcopy
+    cfg = deepcopy(config_or_path)
```
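The aliasing concern above can be reproduced with a standalone sketch (a generic dataclass stands in for DistillationConfig; the field name is made up):

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cfg:  # stand-in for DistillationConfig; `criterion` is illustrative
    criterion: Optional[str] = None


def setup_aliasing(cfg: Cfg) -> Cfg:
    cfg.criterion = "logits_kl"  # mutates the caller's object in place
    return cfg


def setup_copying(cfg: Cfg) -> Cfg:
    cfg = deepcopy(cfg)  # work on a copy; the caller's object stays untouched
    cfg.criterion = "logits_kl"
    return cfg


original = Cfg()
setup_aliasing(original)
print(original.criterion)  # "logits_kl" -- surprising if the caller reuses `original`

original = Cfg()
setup_copying(original)
print(original.criterion)  # None -- unchanged
```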
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if config_or_path is None:
    logger.warning("Distillation config not provided. Using default.")
    cfg = DistillationConfig()
elif isinstance(config_or_path, DistillationConfig):
    from copy import deepcopy
    cfg = deepcopy(config_or_path)
else:
    with open(config_or_path) as f:
        cfg = yaml.safe_load(f)
    cfg = DistillationConfig(**cfg)
```
🤖 Prompt for AI Agents
In modelopt/torch/distill/plugins/megatron.py around lines 110 to 118, the
function assigns a passed-in DistillationConfig instance directly to cfg and
later mutates it, which unintentionally alters the caller's object; to fix this,
create a shallow or deep copy of the incoming DistillationConfig (e.g., via
copy.deepcopy or a provided copy/clone/from_dict constructor) and assign that
copy to cfg before any mutations so the original remains unchanged, or
alternatively document in the function docstring that the input config will be
mutated if that behavior is intended.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main     #399   +/-   ##
=======================================
  Coverage   73.79%   73.79%
=======================================
  Files         171      171
  Lines       17591    17591
=======================================
  Hits        12982    12982
  Misses       4609     4609
```

☔ View full report in Codecov by Sentry.
Signed-off-by: Asha Anoosheh <[email protected]>
What does this PR do?
Type of change: Minor tweak
Overview: ?
Usage
# Add a code snippet demonstrating how to use this
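A minimal usage sketch for the new code path (the import locations and the student/teacher model config objects are assumptions, not taken from this PR):

```python
from modelopt.torch.distill.plugins.megatron import (
    DistillationConfig,
    setup_distillation_config,
)


def build_kd_config(student_cfg, teacher_cfg):
    """student_cfg / teacher_cfg: the Megatron model configs (placeholders here)."""
    kd_cfg = DistillationConfig()  # construct or customize the config in code
    # New in this PR: pass the DistillationConfig instance directly
    # instead of a path to a YAML file (or None for the default).
    return setup_distillation_config(kd_cfg, student_cfg, teacher_cfg)
```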
Testing
Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit
New Features
Refactor